An Enhanced Memetic Algorithm for Feature Selection in Big Data Analytics with MapReduce

Abstract

Recently, various research fields have begun dealing with massive datasets for several functions. The main aim of a feature selection (FS) model is to eliminate noisy, repetitive, and unnecessary features that reduce the efficiency of classification. Within a limited period, traditional FS models cannot manage to filter out unnecessary features. The state-of-the-art literature shows that metaheuristic algorithms perform better than other wrapper-based techniques. Common techniques such as the Genetic Algorithm (GA) and the Particle Swarm Optimization (PSO) algorithm, however, suffer from slow convergence and local optima problems. Even with new-generation heuristics such as the Firefly heuristic and the Fish heuristic, these problems have been shown to be overcome. This paper introduces an improved memetic optimization (EMO) algorithm for FS in this perspective by using conditional criteria in large datasets. The proposed EMO algorithm divides the entire dataset into sample blocks and conducts the learning task in the map steps. The partial results obtained are combined into a final vector of weights in the reduction process, which defines the appropriate collection of characteristics. Finally, a grouping method based on the support vector machine (SVM) takes place. The proposed algorithm is applied within the Spark system, and the experimental results claim that it is superior to other approaches. The simulation results show that maximum AUC values of 0.79 and 0.74, respectively, are obtained by the EMO-FS model.
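The block/map/reduce flow described in the abstract can be illustrated with a minimal, single-machine sketch. This is not the authors' implementation and does not use Spark. All function names (`split_blocks`, `map_block`, `reduce_masks`, `emo_fs_sketch`, `make_toy_data`) are hypothetical. The fitness function is an assumed stand-in (class-mean separation with a size penalty) rather than the wrapper-based evaluation a real system would use. The 0.5 selection threshold is likewise an assumption standing in for the paper's criteria.

```python
import random
from statistics import mean

def split_blocks(rows, n_blocks):
    """Split the dataset into contiguous sample blocks, one per map task."""
    size = (len(rows) + n_blocks - 1) // n_blocks
    return [rows[i:i + size] for i in range(0, len(rows), size)]

def fitness(mask, block):
    """Assumed stand-in score: mean class separation of chosen features minus a size penalty."""
    chosen = [j for j, bit in enumerate(mask) if bit]
    pos = [x for x, y in block if y == 1]
    neg = [x for x, y in block if y == 0]
    if not chosen or not pos or not neg:
        return 0.0
    gaps = [abs(mean(x[j] for x in pos) - mean(x[j] for x in neg)) for j in chosen]
    return mean(gaps) - 0.01 * len(chosen)

def local_search(mask, block):
    """Memetic refinement: keep flipping single bits while a flip improves fitness."""
    best, best_fit = list(mask), fitness(mask, block)
    improved = True
    while improved:
        improved = False
        for j in range(len(best)):
            best[j] ^= 1
            f = fitness(best, block)
            if f > best_fit:
                best_fit, improved = f, True
            else:
                best[j] ^= 1  # undo a flip that did not help
    return best

def map_block(block, n_features, rng, pop_size=6, generations=5):
    """Map step: a tiny genetic search with local refinement on one block."""
    pop = [local_search([rng.randint(0, 1) for _ in range(n_features)], block)
           for _ in range(pop_size)]
    for _ in range(generations):
        a, b = rng.sample(pop, 2)
        cut = rng.randrange(1, n_features)       # one-point crossover
        child = a[:cut] + b[cut:]
        child[rng.randrange(n_features)] ^= 1    # single-bit mutation
        child = local_search(child, block)
        worst = min(range(pop_size), key=lambda i: fitness(pop[i], block))
        if fitness(child, block) > fitness(pop[worst], block):
            pop[worst] = child
    return max(pop, key=lambda m: fitness(m, block))

def reduce_masks(masks):
    """Reduce step: average the per-block binary masks into one weight vector."""
    return [sum(col) / len(masks) for col in zip(*masks)]

def emo_fs_sketch(rows, n_blocks=4, threshold=0.5, seed=0):
    rng = random.Random(seed)
    n_features = len(rows[0][0])
    masks = [map_block(b, n_features, rng) for b in split_blocks(rows, n_blocks)]
    weights = reduce_masks(masks)
    selected = [j for j, w in enumerate(weights) if w >= threshold]
    return weights, selected

def make_toy_data(n_rows=200, n_features=5, seed=0):
    """Synthetic data (an assumption for the demo): only feature 0 carries class signal."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_rows):
        y = i % 2
        x = [rng.gauss(0, 1) for _ in range(n_features)]
        x[0] += 3.0 * y
        rows.append((x, y))
    return rows

weights, selected = emo_fs_sketch(make_toy_data())
print("weights:", weights)
print("selected features:", selected)
```

In a distributed setting, the per-block search would run inside the map tasks and the averaging inside the reduce step. The selected columns would then be passed to a classifier such as an SVM, as the abstract describes.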

Similar resources

A Fuzzy TOPSIS Approach for Big Data Analytics Platform Selection

Big data sizes are constantly increasing. Big data analytics is where advanced analytic techniques are applied on big data sets. Analytics based on large data samples reveals and leverages business change. The popularity of big data analytics platforms, which are often available as open-source, has not remained unnoticed by big companies. Google uses MapReduce for PageRank and inverted indexes....

Evolutionary Feature Selection for Big Data Classification: A MapReduce Approach

Nowadays, many disciplines have to deal with big datasets that additionally involve a high number of features. Feature selection methods aim at eliminating noisy, redundant, or irrelevant features that may deteriorate the classification performance. However, traditional methods lack enough scalability to cope with datasets of millions of instances and extract successful results in a delimited time...

Feature Selection in Structural Health Monitoring Big Data Using a Meta-Heuristic Optimization Algorithm

This paper focuses on the processing of structural health monitoring (SHM) big data. Extracted features of a structure are reduced using an optimization algorithm to find a minimal subset of salient features by removing noisy, irrelevant and redundant data. The PSO-Harmony algorithm is introduced for feature selection to enhance the capability of the proposed method for processing the measure...

Hadoop Mapreduce Framework in Big Data Analytics

Hadoop is a large-scale, open source programming system committed to scalable, distributed, data-intensive processing. Hadoop [1] MapReduce is a programming structure for effectively composing applications which process vast measures of information (multi-terabyte data sets) in parallel on extensive clusters (many nodes) of commodity hardware in a dependable, sho...

An Improved Flower Pollination Algorithm with AdaBoost Algorithm for Feature Selection in Text Documents Classification

In recent years, production of text documents has seen an exponential growth, which is the reason why their proper classification seems necessary for better access. One of the main problems of classifying text documents is working in high-dimensional feature space. Feature Selection (FS) is one of the ways to reduce the number of text attributes. So, working with a great bulk of the feature spa...

Journal

Journal title: Intelligent Automation and Soft Computing

Year: 2022

ISSN: 2326-005X, 1079-8587

DOI: https://doi.org/10.32604/iasc.2022.017123